Abstract

Switching ARMA models greatly enhance the standard linear models in that a
different ARMA model is allowed in each regime, and the regime switching is
typically assumed to be a Markov chain on the finite set of potential regimes.
Although statistical issues have been the subject of many recent papers, there
have been few systematic studies of the probabilistic aspects of this new class
of nonlinear models. This paper discusses some basic issues concerning this
class of models, including strict stationarity, influence of initial conditions,
and second-order properties, by studying SVAR models. A number of examples
are given to illustrate the theory and the variety of applications. Extensions to
other models, such as mean-shifting models and models with inhomogeneous
transition probabilities, are discussed.

1 Introduction

Switching ARMA models belong to a new class of time series models which are
capable of capturing various nonlinear aspects of time series data such as
nonnormality, asymmetry, irreversibility, and variable predictability [e.g.
Hamilton 1989; Hughes and Guttorp 1994; Krolzig 1997; Lu and Berliner 1997].
This class of models extends the linear ARMA system in that a different ARMA
model is allowed in each regime, and the regime switching is typically assumed
to be a Markov chain on the finite set of potential regimes. While statistical
aspects of fitting these models have been much discussed, as summarized by
Krolzig (1997), there have been few systematic studies of the probabilistic
aspects of switching ARMA models, such as stationarity or ergodicity.

This paper discusses some general conditions that ensure stationarity and other
probabilistic properties such as existence of moments. A general theory due to
Brandt (1986) is reviewed (Section 2.2). A theory of stability (that is, of the
noninfluence of initial conditions) of switching vector autoregressive (SVAR)
models is developed (Section 2.3). Some examples are given to illustrate the
generality of the stationarity conditions developed and the variety of
applications of switching vector autoregressive models. For example, we show
(as in Holst et al. (1994)) that unstable and stable subprocesses can be mixed
to produce a stationary process (Example 2), that two unstable subprocesses can
still be mixed to give a stationary process (Example 4), and that stable
subprocesses do not always produce a stationary mixed process, for which a
counterexample is given (Example 3). The second-order theory of switching AR
models is developed (Section 4). We also discuss mean shifting models
(Section 3.3), switching moving average models, and switching ARMA models
(Section 5).

2 General theory



2.1 Switching vector AR models 

A general model is the following vector stochastic difference equation

$$X_n = A_n X_{n-1} + E_n, \qquad n \in \mathbb{Z}, \tag{2.1}$$

where $X_n \in \mathbb{R}^p$, $A_n$ is a $p \times p$ matrix and $E_n \in \mathbb{R}^p$ is a noise
vector. Various additional structure will be imposed on $A_n$, $E_n$ later. For
example, an AR(p) process can be represented as (2.1) in which $A_n$ is a constant
matrix with a special structure. When $\{(A_n, E_n)\}$ is iid, (2.1) is called the
Random Coefficient Autoregressive (RCA) model (Nicholls and Quinn, 1992). Since
such a system is largely used for modelling stationary time series data,
stationarity is a priority in the study of the probabilistic aspects of such
random dynamical systems. A theory for the general stochastic equation (2.1) is
reviewed in Section 2.2. However, one of our objectives is to study the
so-called Markov switching vector AR(1) model (SVAR(1)): Suppose there are $r$
potential regimes, say $S = \{1, 2, \ldots, r\}$, and $I_n$ is a Markov chain
taking values in $S$. Define

$$A_n = \sum_{i=1}^{r} B_i \mathbf{1}_{\{I_n = i\}}, \tag{2.2}$$

where $B_1, \ldots, B_r$ are $r$ unknown or partially unknown $p \times p$ matrices; and

$$E_n = \sum_{i=1}^{r} \Sigma_i \varepsilon_{ni} \mathbf{1}_{\{I_n = i\}}, \tag{2.3}$$

where $\{\varepsilon_{ni}\}$ are independent processes, each of which is iid within itself
with zero mean and identity covariance matrix. In addition, we assume that
$\{I_n\}$ is independent of the noise processes $\{\varepsilon_{n1}, \varepsilon_{n2}, \ldots, \varepsilon_{nr}\}$.
We also assume that $\{I_n\}$ is irreducible and aperiodic, thus ergodic.
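
To make the definitions (2.1)-(2.3) concrete, the following is a minimal simulation sketch in Python. The function name `simulate_svar1`, the Gaussian choice for the noise, and the zero-based regime labels are assumptions made for illustration only; the model itself requires only zero mean and identity covariance.

```python
import random

def simulate_svar1(B, Sigma, P, n_steps, x0, i0=0, seed=0):
    """Simulate X_n = A_n X_{n-1} + E_n with A_n = B[I_n], E_n = Sigma[I_n] eps_n.

    B, Sigma : lists of p x p matrices (lists of rows), one per regime.
    P        : transition matrix, P[i][j] = P(I_n = j | I_{n-1} = i).
    Regimes are labelled 0..r-1 here (the text uses 1..r).
    """
    rng = random.Random(seed)
    p = len(x0)
    x, regime = list(x0), i0
    path, regimes = [], []
    for _ in range(n_steps):
        # Draw the next regime from row `regime` of the transition matrix.
        regime = rng.choices(range(len(P)), weights=P[regime])[0]
        # Only the noise of the active regime enters E_n, so one draw per step
        # suffices. Gaussian noise is an assumption of this sketch.
        eps = [rng.gauss(0.0, 1.0) for _ in range(p)]
        a, s = B[regime], Sigma[regime]
        x = [sum(a[row][c] * x[c] for c in range(p))
             + sum(s[row][c] * eps[c] for c in range(p))
             for row in range(p)]
        path.append(x)
        regimes.append(regime)
    return path, regimes
```

A call such as `simulate_svar1(B, [I2, I2], P, 500, [0.0, 0.0])`, with two hypothetical 2 x 2 coefficient matrices in `B` and identity scale matrices `I2`, returns the simulated states together with the regime path.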

First we ask whether there exists a strictly stationary solution of (2.1). Since
a strictly stationary process need not have any finite moments, this is a
fairly weak requirement. Though necessary and sufficient stationarity conditions
for RCA models are available [e.g. Nicholls and Quinn (1992)], necessary and
sufficient stationarity conditions when $\{A_n\}$ is dependent have not yet been
given (see, however, Bougerol and Picard 1992). However, a very general
sufficient condition that ensures stationarity can be formulated from the proof
of Brandt (1986), and was first made known in Bougerol and Picard (1992). For
convenience of later use, we restate a general theorem related to this theory.
Here it will be assumed that the super-process $\{(A_n, E_n)\}_{n=-\infty}^{\infty}$
consists of (jointly) stationary matrices and vectors. It appears that all known
results in this area make this convenient assumption, though more can be said in
our setup (see below).

2.2 Brandt's result 

We first state a general result giving sufficient conditions for strict
stationarity. Here, we do not need to assume that $A_n$ takes on the discrete
values $B_i$. In the case of SVAR(1), stationarity of $\{A_n, E_n\}$ is equivalent
to assuming that the ergodic chain $\{I_n\}$ starts from the remote past or that
$I_0$ has the stationary distribution.

The tool is the theory of Lyapunov exponents, or products of random matrices. A
technical assumption that ensures existence of Lyapunov exponents for a
stationary sequence of random matrices $A_1, A_2, \ldots, A_n, \ldots$ is

$$E\max(\log\|A_1\|, 0) < \infty. \tag{2.4}$$

This is obviously satisfied if $A_1$ takes on only a finite number of values, as
in the case of SVAR(1).

Under (2.4) the (largest) Lyapunov exponent is defined as

$$\lambda = \lim_{n\to\infty} (1/n)\log\|A_n \cdots A_1\|, \tag{2.5}$$

where the limit holds almost surely.

Furthermore, if the process is ergodic, the Lyapunov exponent is constant and

$$\lambda = \inf\{(1/n)\,E\log\|A_n \cdots A_1\|,\ n \ge 1\}. \tag{2.6}$$

The existence of the limit in (2.5) and (2.6) can be justified using Kingman's
subadditive ergodic theorem. Similarly, the following limit theorem holds with
the same Lyapunov exponent as a consequence of (2.6):

$$\lambda = \lim_{n\to\infty} (1/n)\log\|A_0 A_{-1} \cdots A_{-n+1}\|. \tag{2.7}$$

Note that $\lambda$ is defined independently of the particular matrix norm used.

More generally, under stationarity and (2.4), one can apply Oseledec's
multiplicative ergodic theorem to define a spectrum of Lyapunov exponents
$\lambda_1 = \lambda \ge \lambda_2 \ge \ldots \ge \lambda_p$:

$$\lambda_i = \lim_{n\to\infty} \frac{1}{n}\log\delta_i(n) \quad \text{almost surely, for } 1 \le i \le p, \tag{2.8}$$

where $\delta_1(n) \ge \ldots \ge \delta_p(n)$ are the singular values of $A_n A_{n-1} \cdots A_1$.
Under ergodicity, the $\lambda_i$'s are constants, independent of the particular
realization of $\{A_n\}$.
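
Since (2.5) is an almost-sure limit, the largest exponent can be approximated along a single long realization. The sketch below is one common way to do this and is not taken from the paper: it follows the growth of a renormalised vector rather than the full matrix product, which for almost every starting vector gives the same limit. The function name and the zero-based regime labels are illustrative assumptions, and the code assumes the vector is never mapped exactly to zero.

```python
import math
import random

def estimate_top_lyapunov(B, P, n_steps=100_000, seed=0):
    """Estimate the largest Lyapunov exponent of A_n = B[I_n] by averaging
    log ||A_n v|| / ||v|| along one simulated regime path."""
    rng = random.Random(seed)
    p = len(B[0])
    v = [rng.gauss(0.0, 1.0) for _ in range(p)]
    norm = math.sqrt(sum(c * c for c in v))
    v = [c / norm for c in v]
    regime, total = 0, 0.0
    for _ in range(n_steps):
        regime = rng.choices(range(len(P)), weights=P[regime])[0]
        a = B[regime]
        w = [sum(a[row][c] * v[c] for c in range(p)) for row in range(p)]
        norm = math.sqrt(sum(c * c for c in w))
        total += math.log(norm)          # one-step log growth
        v = [c / norm for c in w]        # renormalise to avoid under/overflow
    return total / n_steps
```

For diagonal matrices the answer is known in closed form (the largest of the averaged log diagonal entries), which gives a simple sanity check of the estimator.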

Proposition 1 Suppose that the super-process $\{A_n, E_n\}$ is stationary and
ergodic. Suppose that $P(A_0 = 0) > 0$, or that the following conditions are met:
(2.4) holds and the Lyapunov exponent for $\{A_n\}$ is negative, that is,

$$\text{(NL):}\quad \lambda = \lim_{m\to\infty} (1/m)\log\|A_0 A_{-1} \cdots A_{-m}\| < 0, \tag{2.9}$$

and the noise satisfies

$$E\max(\log\|E_1\|, 0) < \infty. \tag{2.10}$$

Then (i)

$$W_n = E_n + \sum_{i=0}^{\infty} A_n A_{n-1} \cdots A_{n-i} E_{n-i-1} \tag{2.11}$$

is the only proper stationary solution of (2.1) for the given $\{A_n, E_n\}$.
(ii) The sum on the right-hand side of (2.11) converges absolutely almost
surely. (iii) Furthermore,

$$P\Big(\lim_{n\to\infty} |X_n(x) - W_n| = 0\Big) = 1, \tag{2.12}$$

for an arbitrary random variable $X_{-m-1} = x$ at time $-m-1$ (defined on the same
probability space as $\{A_n, E_n\}$); in particular,

$$X_n(x) \xrightarrow{d} W_0, \quad \text{as } n \to +\infty. \tag{2.13}$$

Part (i) of this result was first given in Bougerol and Picard (1992). The proof
is similar to the one-dimensional case proved by Brandt (1986) under a stronger
assumption. See Bougerol and Picard (1992) for more details and a necessary
condition.

For the convenience of readers, we prove the following lemma for part (ii).

Lemma 1 If the stationary super-process $\{A_n, E_n\}$ satisfies (2.9) and (2.10),
the RHS of (2.11) converges absolutely almost surely.

Proof. First, by (2.9) and (2.10),

$$\begin{aligned}
&\limsup_{i\to\infty}\, (1/(i+1)) \log\|A_n A_{n-1} \cdots A_{n-i} E_{n-i-1}\| \\
&\quad\le \limsup_{i\to\infty}\, \big[(1/(i+1)) \log\|A_n A_{n-1} \cdots A_{n-i}\| + (1/(i+1)) \log\|E_{n-i-1}\|\big] \\
&\quad\le \lambda + 0 < 0, \quad \text{a.s.},
\end{aligned}$$

which implies that

$$\limsup_{i\to\infty}\, \|A_n A_{n-1} \cdots A_{n-i} E_{n-i-1}\|^{1/(i+1)} < 1 \quad \text{a.s.}$$

Thus, the RHS of (2.11), which is bounded in norm by

$$\|E_n\| + \sum_{i=0}^{\infty} \|A_n A_{n-1} \cdots A_{n-i} E_{n-i-1}\|,$$

is absolutely convergent almost surely by virtue of Cauchy's root criterion. □

Since the process $W_n$ defined by (2.11) is a well-defined moving average
function of the ergodic stationary process $\{A_n, E_n\}$, it follows that it is
stationary and ergodic. Thus, $W_n$ is an MA($\infty$) process with random
coefficients.

A key idea in the proof of Proposition 1 is the following expansion, which holds
for any integers $m$ and $n$ as implied by the recursive nature of (2.1):

$$X_n(x) = A_n A_{n-1} \cdots A_1 A_0 A_{-1} \cdots A_{-m}\, x + \sum_{i=0}^{n+m-1} A_n A_{n-1} \cdots A_{n-i} E_{n-i-1} + E_n, \tag{2.14}$$

where $X_n(x)$ can be interpreted as the state at time $n$ of the system governed
by (2.1) if it starts at time $-m-1$ with the random initial state $X_{-m-1} = x$.
Thus, $W_n$ can be regarded as the limit of $X_n(x)$ starting from the remote past.

Further, part (iii) of the proposition says that $X_n(x)$ converges to $W_n$ forward
in time, as $n$ tends to the future. This follows from the identity, obtained by
subtracting (2.11) from (2.14),

$$X_n(x) - W_n = A_n A_{n-1} \cdots A_1 A_0 A_{-1} \cdots A_{-m}\, x - \sum_{i=n+m}^{\infty} A_n A_{n-1} \cdots A_{n-i} E_{n-i-1},$$

which tends to zero almost surely under condition (NL) thanks to Lemma 1.
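
The forgetting of the initial state can be seen numerically. The following sketch, restricted to the scalar case for brevity, drives two trajectories with the same regimes and the same noise but different starting values, so that their gap equals the product of the coefficients times the initial gap. The function name, the Gaussian noise, and the parameter values used to exercise it are illustrative assumptions.

```python
import random

def paired_gaps(b, P, n_steps, x0, x0_alt, seed=0):
    """Run x_n = b[I_n] x_{n-1} + e_n from two starting values with shared
    regimes and noise; return the gap |x_n - x'_n| at every step."""
    rng = random.Random(seed)
    regime, x, y, gaps = 0, x0, x0_alt, []
    for _ in range(n_steps):
        regime = rng.choices(range(len(P)), weights=P[regime])[0]
        e = rng.gauss(0.0, 1.0)          # shared noise (Gaussian by assumption)
        x = b[regime] * x + e
        y = b[regime] * y + e
        gaps.append(abs(x - y))
    return gaps
```

With, say, `b = [0.5, 1.1]` and a chain that spends about two thirds of its time in the first regime, the averaged log coefficient is negative and the gap shrinks geometrically even though the second regime is explosive.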

Remark 1. Since for any positive random variable $X$, by Jensen's inequality,
$E\log X \le \log EX$ holds whenever $EX < \infty$, it follows that whenever
$EX^{\alpha} < \infty$ for some $\alpha > 0$ we have $E\log X < \infty$ and hence
$E\max(0, \log X) < \infty$. (Note that $\max(0, \log X)$ is the positive part of
$\log X$.)

Next, we consider the more realistic situation in which a Markov switching
process starts from a finite time in the past, and discuss when such a process
can be stationary and ergodic.

2.3 Stability of SVAR models 

Under (2.4) the (largest) Lyapunov exponent is defined as in (2.5):

$$\lambda = \lim_{n\to\infty} (1/n)\log\|A_n \cdots A_1\| \quad \text{almost surely.} \tag{2.15}$$

Now consider the situation in which the SVAR(1) process starts at some fixed
time, say time 0, with an arbitrary starting value $X_0$, and the regime process
$\{I_n\}$ starts from an arbitrary distribution $I_0$. Let $X_n(X_0, I_n(I_0))$ denote
the process evolving according to (2.1) with starting value $X_0$ and starting
regime $I_0$ at time 0. The question arises as to what the influence of the
initial condition, or the transient effect, is. Naturally, one would hope that
the initial effect will eventually be washed out. It is indeed so. We prove it
in the next theorem after establishing a lemma. The result of this lemma is well
known (e.g., Bhattacharya and Waymire (1990), p. 197); however, we include it
here for the sake of completeness.

Lemma 2 Let $I^1(\cdot)$ and $I^2(\cdot)$ be two independent replicas of an irreducible
and aperiodic Markov chain $I(\cdot)$ with finite state space $S$ (with $r$ elements),
having the same transition probability matrix $P = ((p_{ij}))$. Define

$$\tau = \inf\{k \ge 0 : I^1_k = I^2_k\}.$$

Then, for any $i, j \in S$, $P(\tau > n \mid I^1_0 = i, I^2_0 = j)$ converges to zero
exponentially fast as $n \to \infty$.

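Proof. Define

$$\rho(r_0) = \max_{k,l} P(I^1_m \ne I^2_m,\ 1 \le m \le r_0 \mid I^1_0 = k, I^2_0 = l).$$

Since the state space is finite, under irreducibility and aperiodicity there
exists an $r_0 \ge 1$ such that $p^{(r_0)}_{ij} > 0$ for all $i, j \in S$. Let
$\alpha = \min_{i,j} p^{(r_0)}_{ij}$. Then $\alpha > 0$ and

$$\begin{aligned}
\rho(r_0) &\le \max_{k,l} P(I^1_{r_0} \ne I^2_{r_0} \mid I^1_0 = k, I^2_0 = l) \\
&= \max_{k,l} \Big(1 - \sum_i P(I^1_{r_0} = i \mid I^1_0 = k)\, P(I^2_{r_0} = i \mid I^2_0 = l)\Big) \\
&= \max_{k,l} \Big(1 - \sum_i p^{(r_0)}_{ki}\, p^{(r_0)}_{li}\Big) \le 1 - r\alpha^2 < 1.
\end{aligned}$$

Let $n \ge 1$ be an integer. Then, using the Markov property and the time
homogeneity of the joint Markov chain $(I^1, I^2)$, we obtain

$$\begin{aligned}
&P(I^1_m \ne I^2_m,\ 1 \le m \le n r_0 \mid I^1_0 = i, I^2_0 = j) \\
&= \sum_{k \ne l} P(I^1_m \ne I^2_m,\ 1 \le m \le (n-1) r_0,\ I^1_{(n-1)r_0} = k,\ I^2_{(n-1)r_0} = l \mid I^1_0 = i, I^2_0 = j) \\
&\qquad \times P(I^1_m \ne I^2_m,\ (n-1) r_0 < m \le n r_0 \mid I^1_{(n-1)r_0} = k,\ I^2_{(n-1)r_0} = l) \\
&= \sum_{k \ne l} P(I^1_m \ne I^2_m,\ 1 \le m \le (n-1) r_0,\ I^1_{(n-1)r_0} = k,\ I^2_{(n-1)r_0} = l \mid I^1_0 = i, I^2_0 = j) \\
&\qquad \times P(I^1_m \ne I^2_m,\ 1 \le m \le r_0 \mid I^1_0 = k, I^2_0 = l) \\
&\le \sum_{k \ne l} P(I^1_m \ne I^2_m,\ 1 \le m \le (n-1) r_0,\ I^1_{(n-1)r_0} = k,\ I^2_{(n-1)r_0} = l \mid I^1_0 = i, I^2_0 = j) \times \rho(r_0) \\
&= P(I^1_m \ne I^2_m,\ 1 \le m \le (n-1) r_0 \mid I^1_0 = i, I^2_0 = j) \times \rho(r_0).
\end{aligned}$$

Using the above argument recursively we get

$$P(I^1_m \ne I^2_m,\ 1 \le m \le n r_0 \mid I^1_0 = i, I^2_0 = j) \le \rho^n(r_0).$$

Consequently, we obtain, for any $n \ge r_0$,

$$\begin{aligned}
P(\tau > n \mid I^1_0 = i, I^2_0 = j)
&\le P(I^1_m \ne I^2_m,\ 1 \le m \le n \mid I^1_0 = i, I^2_0 = j) \\
&\le P(I^1_m \ne I^2_m,\ 1 \le m \le [n/r_0]\, r_0 \mid I^1_0 = i, I^2_0 = j) \\
&\le \rho^{[n/r_0]}(r_0),
\end{aligned}$$

where $[t]$ denotes the largest integer less than or equal to $t$. Hence the result. □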
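
For a small chain the tail of the meeting time in Lemma 2 can be computed exactly rather than bounded. The sketch below (function name and zero-based state labels are illustrative assumptions) propagates the probability that the pair of independent chains has stayed off the diagonal, using the product chain restricted to pairs of distinct states.

```python
def coupling_tail(P, i, j, n_max):
    """Return [P(tau > n | I^1_0 = i, I^2_0 = j) for n = 0..n_max] for two
    independent copies of the chain with transition matrix P."""
    r = len(P)
    if i == j:
        return [0.0] * (n_max + 1)       # the chains already agree: tau = 0
    # u[(k, l)] = probability of no meeting in the next n steps from (k, l).
    u = {(k, l): 1.0 for k in range(r) for l in range(r) if k != l}
    tails = [1.0]
    for _ in range(n_max):
        u = {(k, l): sum(P[k][a] * P[l][b] * u[(a, b)] for (a, b) in u)
             for (k, l) in u}
        tails.append(u[(i, j)])
    return tails
```

For a two-state chain with all transition probabilities equal to 1/2 the tail is exactly $2^{-n}$, while for the periodic chain that alternates deterministically between its two states the tail stays at 1, so two copies started in different states never meet.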

Theorem 1 As in condition (NL), assume that under (2.4) the (largest) Lyapunov
exponent $\lambda$ satisfies

$$\lambda := \lim_{n\to\infty} (1/n)\log\|A_n \cdots A_1\| < 0 \quad \text{almost surely.} \tag{2.16}$$

Under this assumption the SVAR process is stable, i.e., it has a unique
asymptotic distribution that is free from the influence of the initial
distribution.

Proof. Let us assume first that $\{I_n\}$ starts at $I_0$ which has the stationary
distribution of the ergodic chain. Then it follows that $\{A_n, E_n\}$ is
stationary. Hence

$$\begin{aligned}
X_n(X_0, I_n(I_0)) &= A_n A_{n-1} \cdots A_1 X_0 + \sum_{j=0}^{n-1} A_n A_{n-1} \cdots A_{n-j} E_{n-j-1} + E_n \\
&= A_n A_{n-1} \cdots A_1 X_0 + \sum_{i=0}^{n-1} A_{i+1} A_i \cdots A_1 E_0 + E_0 \quad \text{in distribution.}
\end{aligned} \tag{2.17}$$

Then,

$$\begin{aligned}
&\limsup_{i\to\infty}\, (1/(i+1)) \log\|A_{i+1} A_i \cdots A_1 E_0\| \\
&\quad\le \limsup_{i\to\infty}\, \big[(1/(i+1)) \log\|A_{i+1} A_i \cdots A_1\| + (1/(i+1)) \log\|E_0\|\big] \\
&\quad\le \lambda + 0 < 0, \quad \text{a.s.},
\end{aligned} \tag{2.18}$$

which implies that

$$\limsup_{i\to\infty}\, \|A_{i+1} A_i \cdots A_1 E_0\|^{1/(i+1)} < 1 \quad \text{a.s.}$$

Thus, the RHS of (2.17) is bounded by

$$\|E_0\| + \sum_{i=0}^{\infty} \|A_{i+1} A_i \cdots A_1 E_0\|,$$

which is absolutely convergent almost surely by Cauchy's root criterion, and
$\|A_n A_{n-1} \cdots A_1 X_0\| \to 0$ as $n \to \infty$ for any $X_0$, as in (2.18).
Therefore, $X_n(X_0, I_n(I_0))$ converges in distribution as $n \to \infty$ whenever
$I_0$ has the stationary distribution.

Let us now observe that

$$\begin{aligned}
X_n(X_0, I_n(I_0)) - X_n(X'_0, I_n(I_0)) &= A_n \big(X_{n-1}(X_0, I_n(I_0)) - X_{n-1}(X'_0, I_n(I_0))\big) \\
&= \cdots = A_n A_{n-1} \cdots A_1 (X_0 - X'_0).
\end{aligned} \tag{2.19}$$

Thus, we obtain

$$(1/n) \log\big(|X_n(X_0, I_n(I_0)) - X_n(X'_0, I_n(I_0))|\big) \le (1/n) \sum_{i=1}^{n} \log(\|A_i\|) + (1/n) \log(|X_0 - X'_0|),$$

and hence, by the strong law for the $A_i$'s and under condition (2.16), we obtain
that the distance between $X_n(X_0, I_n(I_0))$ and $X_n(X'_0, I_n(I_0))$ converges to
zero almost surely, exponentially fast, regardless of $I_0$, as $n$ tends to
infinity.

To see that $X_n(X_0, I_n(I_0))$ and $X_n(X'_0, I_n(I'_0))$ have the same asymptotic
distribution, it is important to notice that two independent finite-state
ergodic Markov chains $I_n(I_0)$ and $I_n(I'_0)$, starting at $I_0$ and $I'_0$
respectively, will meet at some finite stopping time, say $\tau$ (all of whose
moments are also finite), with probability one. Define

$$\tilde I_n(I'_0) = \begin{cases} I_n(I'_0), & \text{for } n < \tau, \\ I_n(I_0), & \text{for } n \ge \tau, \end{cases}$$

i.e., $\tilde I_n(I'_0)$ follows the chain $I_n(I'_0)$ in the beginning, switches to
$I_n(I_0)$ at the stopping time $\tau$, and moves along the same path thereafter.
Since $\tilde I_n(I'_0)$ and $I_n(I'_0)$ have the same initial distribution and the
same transition law, they have the same distribution. Hence for any bounded
Lipschitz function $f$ we get

$$\begin{aligned}
&|Ef(X_n(X_0, I_n(I_0))) - Ef(X_n(X'_0, I_n(I'_0)))| \\
&= |Ef(X_n(X_0, I_n(I_0))) - Ef(X_n(X'_0, \tilde I_n(I'_0)))| \\
&= \big|E\big([f(X_n(X_0, I_n(I_0))) - f(X_n(X'_0, \tilde I_n(I'_0)))]\,\mathbf{1}_{\{\tau \le m\}}\big) \\
&\qquad + E\big([f(X_n(X_0, I_n(I_0))) - f(X_n(X'_0, \tilde I_n(I'_0)))]\,\mathbf{1}_{\{\tau > m\}}\big)\big| \\
&\le \big|E\big[E\big([f(X_n(X_0, I_n(I_0))) - f(X_n(X'_0, \tilde I_n(I'_0)))]\,\mathbf{1}_{\{\tau \le m\}} \mid \mathcal{F}_{\tau_m}\big)\big]\big| + 2\|f\|\,P(\tau > m),
\end{aligned} \tag{2.20}$$

where $\tau_m = \tau \wedge m$ and $\mathcal{F}_j$ is an appropriate filtration with respect
to which $\{I_n, X_n\}$ are adapted. We restrict the class of $f$ to those whose
Lipschitz constant is bounded by one and for which $\|f\| \le 1$, and call this
restricted class BL. Then by the Markov property we get, for $m < n$,

$$\begin{aligned}
&\big|E\big([f(X_n(X_0, I_n(I_0))) - f(X_n(X'_0, \tilde I_n(I'_0)))]\,\mathbf{1}_{\{\tau \le m\}} \mid \mathcal{F}_{\tau_m}\big)\big| \\
&= \big|E\big[f(X_{n-\tau_m}(z, I_{n-\tau_m}(J))) - f(X_{n-\tau_m}(z', I_{n-\tau_m}(J)))\big]\big| \\
&\le E\big(|X_{n-\tau_m}(z, I_{n-\tau_m}(J)) - X_{n-\tau_m}(z', I_{n-\tau_m}(J))| \wedge 2\big)
\end{aligned} \tag{2.21}$$

conditionally on $z = X_{\tau_m}(X_0, I_{\tau_m}(I_0))$, $z' = X_{\tau_m}(X'_0, \tilde I_{\tau_m}(I'_0))$
and $J = I_{\tau_m}(I_0)$. Since, by the earlier argument, for each $z, z', J$,
$|X_{n-\tau_m}(z, I_{n-\tau_m}(J)) - X_{n-\tau_m}(z', I_{n-\tau_m}(J))|$ goes to zero almost
surely, exponentially fast, as $n \to \infty$, by Lebesgue's dominated convergence
theorem $E(|X_{n-\tau_m}(z, I_{n-\tau_m}(J)) - X_{n-\tau_m}(z', I_{n-\tau_m}(J))| \wedge 2) \to 0$
as $n \to \infty$, almost surely, for each fixed $m \ge 1$. Therefore, again using
Lebesgue's dominated convergence theorem and the fact that $\tau$ is finite with
probability one (by Lemma 2), we obtain, first by taking the limit $n \to \infty$
and then $m \to \infty$,

$$\begin{aligned}
&|Ef(X_n(X_0, I_n(I_0))) - Ef(X_n(X'_0, I_n(I'_0)))| \\
&\le \big|E\big[E\big([f(X_n(X_0, I_n(I_0))) - f(X_n(X'_0, \tilde I_n(I'_0)))]\,\mathbf{1}_{\{\tau \le m\}} \mid \mathcal{F}_{\tau_m}\big)\big]\big| + 2\|f\|\,P(\tau > m) \to 0
\end{aligned} \tag{2.22}$$

uniformly over bounded Lipschitz functions $f$ in BL. Since the class BL
characterizes weak convergence, the theorem follows (for an analogous result in
continuous time, see Basak, Bisi and Ghosh (1999)). □

Corollary 1 Under the useful and simpler condition that the random matrix $A_1$
satisfies

$$\text{(CB):}\quad E\log\|A_1\| < 0 \tag{2.23}$$

for a given norm $\|\cdot\|$, the SVAR process is stable, i.e., it has a unique
asymptotic distribution that is free from the influence of the initial
distribution.

Proof. By (2.6) with $n = 1$ we have $\lambda \le E\log\|A_1\|$, so condition (CB)
implies the negative Lyapunov condition (NL) in Proposition 1 for any norm.
Hence the proof. □

Remark. Brandt (1986) focuses mainly on (CB). However, being independent of the
matrix norm, condition (NL) of Proposition 1 is more natural in
multidimensional systems.

Remark. It is clear that, if the assumption of irreducibility is dropped, then
one needs to restrict attention to the irreducible subclasses. Within each
irreducible subclass the above result is true under aperiodicity. Also, it is
easy to see that if the assumption of aperiodicity is dropped then the above
theorem fails, i.e., the asymptotic distribution is influenced by the initial
distribution.

The importance of Theorem 1 lies in the fact that, in practice, we do not have
data that start from $-\infty$ or follow a nice initial distribution (such as the
stationary distribution); rather, we have data which start from a finite time
in the past and with an arbitrary, usually unknown, initial distribution. In
such a case, having a common limiting distribution in forward time is a
necessity for making inference from the data.

Certainly, the question remains of determining the rate of convergence to the
limiting distribution. A more interesting and challenging problem is to check
for stability using the Lyapunov exponent approach. For this, a theoretical
question arises: is the analogue of Kingman's subadditive ergodic theorem, or
more generally of Oseledec's multiplicative ergodic theorem, true when the
sequence of random matrices $\{A_n\}$ follows a Markov chain and the initial value
is arbitrary? We think this is likely to be the case (recall the law of large
numbers for Markov chains) but have not seen any known result on this.

3 Examples 

Proposition 1 gives a general criterion for checking stationarity of switching
autoregressive models via negativity of the largest Lyapunov exponent. Theorem 1
proves the more relevant stability property under a stronger condition.
Techniques for calculating Lyapunov exponents for a sequence of random matrices
therefore become very important in checking for stationarity. Unfortunately, it
is extremely difficult to obtain an explicit formula for Lyapunov exponents
except in very special cases, and in the general case we may have to resort to
numerical methods.

3.1 Cases when the $A_i$'s commute

In the special cases when a formula for the Lyapunov exponents is available, a
condition for stationarity follows immediately. Some situations are discussed
next. Let $A_1, A_2, \ldots$ be an ergodic stationary sequence of $p \times p$ random
matrices and denote $A_k = (a_{ij}(k))$.

Lemma 3 (i) Suppose the $A_k$'s are upper triangular, i.e. $a_{ij}(k) = 0$ for any
$i > j$, and assume that $E\max(0, \log|a_{ii}(1)|) < \infty$ for all $1 \le i \le p$.
Then the Lyapunov exponents exist, and they correspond to the ordered sequence
of the $p$ quantities defined by

$$\lim_{n\to\infty} \frac{1}{n} \sum_{k=1}^{n} \log|a_{ii}(k)| = E\log|a_{ii}(1)|, \quad \text{for } i = 1, \ldots, p.$$

(ii) If any pair of the matrices $A_k$ commute, let $\delta_1(1) \ge \ldots \ge \delta_p(1)$
be the ordered eigenvalues of $A_1$ and assume $E\max(0, \log|a_{ii}(1)|) < \infty$.
Then the Lyapunov exponents exist and are given by $\lambda_i = E\log|\delta_i(1)|$
for $i = 1, \ldots, p$.

We now specialize the preceding theory to the switching AR model (2.1) when
$A_n$ takes on one of the $r$ possible matrices $B_1, \ldots, B_r$. Obviously, if the
sequence $A_1, A_2, \ldots$ is stationary, Lyapunov exponents always exist, because
(2.4) holds automatically. In particular, let the stationary distribution of
$I_n$ be $p$, such that $P(I_n = i) \to p_i$ for $1 \le i \le r$ and
$p_1 + \cdots + p_r = 1$. Let $E$ denote the expectation over the joint product
space of $\{I_n\}$ and $\{\varepsilon_{ni}, i = 1, \ldots, r\}$ under $p$. Then, (2.6) implies
that

$$\lambda \le \sum_{i=1}^{r} p_i \log\|B_i\|. \tag{3.1}$$

Thus, if there exists a norm such that $\|B_i\| \le 1$ for $1 \le i \le r$, where
strict inequality holds for at least one $i$, then the negative Lyapunov
condition is satisfied. If

$$E\max(0, \log\|\varepsilon_{1i}\|) < \infty \quad \text{for } 1 \le i \le r, \tag{3.2}$$

then by Proposition 1 the Markov switching AR model with at most random walk
type nonstationarity in its subprocesses and at least one stable subprocess is
stationary. We have by now used the terms stable process and stability in
several places. What we mean is that the processes starting from different
initial conditions converge. In the case of a vector AR(1) process, this is
equivalent to the coefficient matrix $A$ having eigenvalues whose moduli are all
less than one, and the latter coincides with the stationarity condition (cf.
Example 1).
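
The bound (3.1) is straightforward to evaluate numerically. The sketch below makes two choices that are not in the text: the stationary distribution is obtained by simple power iteration, and the norm is the maximum absolute row sum (an induced matrix norm), picked only because it needs no eigenvalue routine. The function names are illustrative, and the code assumes no $B_i$ is the zero matrix.

```python
import math

def stationary_distribution(P, n_iter=10_000):
    """Approximate the stationary row vector of an ergodic transition matrix."""
    r = len(P)
    pi = [1.0 / r] * r
    for _ in range(n_iter):
        pi = [sum(pi[k] * P[k][j] for k in range(r)) for j in range(r)]
    return pi

def max_row_sum_norm(M):
    """Induced infinity norm: the largest absolute row sum."""
    return max(sum(abs(v) for v in row) for row in M)

def lyapunov_upper_bound(B, P):
    """Right-hand side of (3.1): sum_i p_i log ||B_i||."""
    pi = stationary_distribution(P)
    return sum(w * math.log(max_row_sum_norm(M)) for w, M in zip(pi, B))
```

A negative return value certifies the negative Lyapunov condition for the chosen norm; a nonnegative value is inconclusive, since (3.1) is only an upper bound and another norm may do better.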

Example 2. In the one-dimensional case, the negative Lyapunov condition reduces
to $E\log|a_n| < 0$. In particular, if $a_n$ takes on finitely many values
$b_1, \ldots, b_r$, this is

$$\sum_{i=1}^{r} \log|b_i|\, \Pr(a_n = b_i) < 0. \tag{3.3}$$

This is satisfied if one $|b_i| < 1$ and all other $|b_j| \le 1$ $(j \ne i)$. That
is, under (3.2) a switching autoregressive model is stable as long as it has a
positive probability of being in a stable regime while all other regimes are
either stationary or random-walk type nonstationary. Obviously, explosive
behavior ($|b_i| > 1$) in some regimes is also allowed as long as (3.3) is
satisfied. □
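
Condition (3.3) is a one-line computation. The following sketch evaluates its left-hand side for given coefficients and stationary regime probabilities; the function name and the numerical values used to exercise it are illustrative and do not come from the paper.

```python
import math

def scalar_lyapunov(b, probs):
    """Left-hand side of (3.3): sum_i log|b_i| * Pr(a_n = b_i)."""
    return sum(p * math.log(abs(v)) for v, p in zip(b, probs))
```

For instance, a stable regime with coefficient 0.5 visited half of the time offsets an explosive regime with coefficient 1.2, whereas pairing 1.2 with 0.9 at equal probabilities gives a positive value and (3.3) fails.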

The conclusion of Example 2 in the one-dimensional case, though benign and
reasonable, cannot be extended to the multidimensional case, except in trivial
cases such as those of Lemma 3, when the $B_i$'s are either triangular or
commuting. Initially, we thought that the mixture of two stable processes is
always stable. This turns out not to be true in the multidimensional case. A
counterexample (Example 3) is given to show that two stable subprocesses can be
mixed to produce an unstable switching process. On the other hand, two unstable
subprocesses can be mixed to produce a stable switching process (Example 4).

3.2 Calculating Lyapunov exponents in a nontrivial case 

For Example 3, we need an explicit formula for the Lyapunov exponent in a
nontrivial case due to Pincus (1985); see Lima and Rahibe (1994). Consider the
case $r = 2$ and two $2 \times 2$ real matrices $B_1$ and $B_2$, where $B_1$ is singular.
Denote the transition probability matrix of $\{I_n\}$ by
$P(I_n = j \mid I_{n-1} = i) = p_{ij}$, $i, j = 1, 2$, and the initial distribution by
$P(I_0 = i) = p_i$, $i = 1, 2$.

By a change of basis, we can assume that $B_1$ takes the form

$$B_1 = \begin{pmatrix} \delta_1 & 0 \\ 0 & 0 \end{pmatrix}.$$

(The other form, $B_1 = \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}$, is not interesting
because $B_1^2 = 0$.) We write $B_2^n$ in the form

$$B_2^n = \begin{pmatrix} b_{11}(n) & b_{12}(n) \\ b_{21}(n) & b_{22}(n) \end{pmatrix};$$

then a result due to Pincus (1985) and Lima and Rahibe (1994) says that the
Lyapunov exponent is given by

$$\lambda = \frac{p_{21}}{p_{21} + p_{12}} \log|\delta_1| + \frac{p_{12}\, p_{21}^2}{p_{21} + p_{12}} \sum_{n=1}^{\infty} p_{22}^{\,n-1} \log|b_{11}(n)|. \tag{3.4}$$

In the case that $B_2$ is singular, we consider the case that

$$B_2 = Q^{-1} \begin{pmatrix} \delta_2 & 0 \\ 0 & 0 \end{pmatrix} Q,$$

where $Q$ is an invertible matrix. (By a simple argument, in the other case
$B_2 = Q^{-1} \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix} Q$, we have $\lambda = -\infty$, which is
not what we want.) Then, from Lima and Rahibe (3.2),

$$\lambda = \frac{p_{21}}{p_{21} + p_{12}} \log|\delta_1| + \frac{p_{12}}{p_{12} + p_{21}} \log|\delta_2| + \frac{p_{12}\, p_{21}}{p_{12} + p_{21}} \log\left|\frac{b_{11}(1)}{\mathrm{Tr}(B_2)}\right|. \tag{3.5}$$

Example 3. Consider

$$B_1 = \begin{pmatrix} \delta_1 & 0 \\ 0 & 0 \end{pmatrix} \quad \text{and} \quad B_2 = \begin{pmatrix} b_1 & -c\,b_1 \\ b_2 & -c\,b_2 \end{pmatrix}.$$

The eigenvalues of $B_2$ are $0$ and $\delta_2 = b_1 - c\,b_2$. The Lyapunov exponent is
given by

$$\lambda = \frac{p_{21}}{p_{21} + p_{12}} \log|\delta_1| + \frac{p_{12}}{p_{12} + p_{21}} \log|b_1 - c\,b_2| + \frac{p_{12}\, p_{21}}{p_{12} + p_{21}} \log\left|\frac{b_1}{b_1 - c\,b_2}\right|. \tag{3.6}$$

We want to choose $b_1, b_2, c, \delta_1$ and the $p_{ij}$'s so that $|\delta_1| < 1$,
$|\delta_2| < 1$ and $\lambda > 0$. Since the first two terms in (3.6) are negative, we
need to make the third term as large as possible. Thus, $b_1/(b_1 - c\,b_2)$ should
be large. For example, choose $b_1 = 100$, $c = 10$, $b_2 = 9.99$, so that
$\delta_2 = 0.1$. Then in order that $\lambda > 0$ we require

$$-p_{21} \log|\delta_1| + p_{12} \log 10 < 3\, p_{21}\, p_{12} \log 10.$$

This is satisfied if e.g. $\delta_1 = 0.1$, $p_{21} = p_{12} = 0.8$. □
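
The counterexample can be checked numerically. The sketch below evaluates the closed form (3.6) and, independently, estimates the exponent by following a renormalised vector under the switching product of $B_1 = \mathrm{diag}(\delta_1, 0)$ and the rank-one $B_2$ of Example 3. The function names, the simulation approach and the run length are assumptions of this illustration, not part of the paper.

```python
import math
import random

def closed_form_exponent(delta1, b1, b2, c, p12, p21):
    """Evaluate (3.6) for B1 = diag(delta1, 0), B2 = [[b1, -c*b1], [b2, -c*b2]]."""
    delta2 = b1 - c * b2                       # nonzero eigenvalue (trace) of B2
    total = (p21 * math.log(abs(delta1))
             + p12 * math.log(abs(delta2))
             + p12 * p21 * math.log(abs(b1 / delta2)))
    return total / (p12 + p21)

def simulated_exponent(delta1, b1, b2, c, p12, p21, n_steps=50_000, seed=0):
    """Average one-step log growth of a renormalised vector along one path."""
    rng = random.Random(seed)
    v = [rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)]
    in_regime_1, total = True, 0.0
    for _ in range(n_steps):
        # Regime 1 is left with probability p12, regime 2 with probability p21.
        if rng.random() < (p12 if in_regime_1 else p21):
            in_regime_1 = not in_regime_1
        if in_regime_1:
            w = [delta1 * v[0], 0.0]           # B1 v
        else:
            s = v[0] - c * v[1]
            w = [b1 * s, b2 * s]               # B2 v
        norm = math.hypot(w[0], w[1])
        total += math.log(norm)
        v = [w[0] / norm, w[1] / norm]
    return total / n_steps
```

With the values of Example 3 ($\delta_1 = 0.1$, $b_1 = 100$, $b_2 = 9.99$, $c = 10$, $p_{12} = p_{21} = 0.8$) the closed form equals $0.2 \log 10$, which is positive even though both matrices have spectral radius 0.1.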

If one subprocess is stable and the other is unstable, in most situations there
exists a switching strategy that makes the mixed process stable. Consider the
situation in which there exists a subordinate matrix norm such that
$\|B_1\| < 1$ and $\|B_2\| > 1$. Then $E\log\|A_1\| = p_1 \log\|B_1\| + p_2 \log\|B_2\|$
can be made less than 0 if $p_2$ is small enough. We call this strategy preferred
switching, to denote the phenomenon that a mixture process with a less frequent
unstable regime can still be stable.

Now we give an example in which two unstable vector processes give rise to a
stable mixed process.

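Example 4. Consider an extension of Example 2 to the multidimensional case when
the $B_i$'s commute. For example, let

$$B_1 = \begin{pmatrix} 2 & 0 \\ 0 & 1/2 \end{pmatrix}, \qquad B_2 = \begin{pmatrix} 1/3 & 0 \\ 0 & 3/2 \end{pmatrix}.$$
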
The two Lyapunov exponents associated with the switching between $B_1$ and $B_2$
are given by

$$\lambda_1 = p_1 \log 2 - p_2 \log 3, \qquad \lambda_2 = -p_1 \log 2 + p_2 (\log 3 - \log 2).$$

We require that $\lambda_1 < 0$ and $\lambda_2 < 0$. Let $p = p_1$, so that $p_2 = 1 - p$.
This is true if and only if

$$\frac{\log 3 - \log 2}{\log 3} < p < \frac{\log 3}{\log 2 + \log 3}.$$

□
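
The admissible range of $p$ in Example 4 can be verified directly from the two stated exponents. In the sketch below the function names are illustrative; a diagonal pair such as $\mathrm{diag}(2, 1/2)$ and $\mathrm{diag}(1/3, 3/2)$ is one choice of commuting, individually unstable matrices consistent with these exponents.

```python
import math

def example4_exponents(p1):
    """Lyapunov exponents of Example 4 as functions of p1 (with p2 = 1 - p1)."""
    p2 = 1.0 - p1
    lam1 = p1 * math.log(2) - p2 * math.log(3)
    lam2 = -p1 * math.log(2) + p2 * (math.log(3) - math.log(2))
    return lam1, lam2

def stability_interval():
    """Open interval of p1 on which both exponents are negative."""
    lower = (math.log(3) - math.log(2)) / math.log(3)
    upper = math.log(3) / (math.log(2) + math.log(3))
    return lower, upper
```

Numerically the interval is roughly (0.369, 0.613): the first regime must be visited often enough to tame the second coordinate but not so often that the first coordinate explodes.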
3.3 Mean shifting models

Consider the mean shifting model given by

$$X_n = M_n + A_n X_{n-1} + E_n, \tag{3.7}$$

where $A_n$ and $E_n$ are as before and $M_n$ is the shifting mean, defined as
$\mu_i$ when $I_n = i$ for $i = 1, 2, \ldots, r$, or
$M_n = \sum_{i=1}^{r} \mu_i \mathbf{1}_{\{I_n = i\}}$.

The mean-shifting model can be regarded as a more general case of the switching
AR (SAR) model in which $E_n$ is allowed to have a nonzero mean as well, such as
$\mu_i$ when $I_n = i$ for some $i$. An interesting case is when $A_n$ is constant and
only the mean or variance of $E_n$ varies among the different regimes.

Obviously $M_n$ is a stationary sequence if $I_n$ is. Using an expansion similar
to (2.14) and Proposition 1, it can be shown that the proper stationary solution
of (3.7) is given by

$$W'_n = \Big(M_n + \sum_{i=0}^{\infty} A_n A_{n-1} \cdots A_{n-i} M_{n-i-1}\Big) + \Big(E_n + \sum_{i=0}^{\infty} A_n A_{n-1} \cdots A_{n-i} E_{n-i-1}\Big). \tag{3.8}$$

That is, the stationary solution of (3.7) is given by the sum of two stationary
processes,

$$\widetilde{M}_n = M_n + \sum_{i=0}^{\infty} A_n A_{n-1} \cdots A_{n-i} M_{n-i-1} \tag{3.9}$$

and $W_n$ of (2.11). Note that (3.9) is in general well-defined under the negative
Lyapunov exponent assumption [cf. (2.9)] and

$$E\max(\log\|M_1\|, 0) < \infty$$

(cf. the proof of Lemma 1). In particular, the above condition is satisfied if
$M_n$ takes values in a finite set.

Example 5. Hamilton (1989)'s model for the business cycle uses a fourth-order
autoregression and a mean-shifting model with two regimes. Written in our state
space representation (3.7), this corresponds to $A_n$ taking on a fixed

$$A = \begin{pmatrix} a_1 & a_2 & a_3 & a_4 \\ 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \end{pmatrix}$$

and $M_n$ taking on $(\mu_i, 0, 0, 0)^T$ depending on $I_n = i$, $i = 1, 2$. By our
theory, this model has a stationary and stable solution as long as $A$ is
stable. In particular, the empirical model of Krolzig (1997, Sec. 11.3.1) for
the German business cycle with $a_1 = 0.2932$, $a_2 = 0.1055$, $a_3 = 0.0026$,
$a_4 = 0.3812$ clearly has a stationary solution because
$a_1 + a_2 + a_3 + a_4 < 1$ and the $a_i$'s are positive.
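
The stability claim for the companion matrix can be checked without an eigenvalue routine. For nonnegative AR coefficients the companion matrix is entrywise nonnegative, so by the Perron-Frobenius theorem its spectral radius is its unique positive eigenvalue, i.e. the positive root of $z^4 = a_1 z^3 + a_2 z^2 + a_3 z + a_4$. The sketch below finds that root by bisection; the function name is illustrative, and the method assumes all coefficients are nonnegative with at least one positive.

```python
def perron_root(a, tol=1e-12):
    """Spectral radius of the AR companion matrix with nonnegative
    coefficients a = (a_1, ..., a_p), found as the positive root of
    1 - sum_j a_j / z^j = 0 by bisection."""
    def f(z):
        return 1.0 - sum(c / z ** (j + 1) for j, c in enumerate(a))
    lo, hi = 0.0, max(1.0, sum(a))       # f(hi) >= 0 and f(z) -> -inf as z -> 0+
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if f(mid) < 0:
            lo = mid                     # root lies to the right of mid
        else:
            hi = mid
    return hi
```

For the coefficients quoted from Krolzig (1997) the root lies a little above 0.9, consistent with the sum of the coefficients being 0.7825 < 1.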

Example 6. We discuss another mean-shifting model, which is a simplified version
of Lu and Berliner (1997)'s model for a riverflow time series $y_n$. Their model
consists of a mixture of AR(1), ARX(1), and AR(1) models with different means in
each of the three regimes, corresponding to normal (0), rising (1), or falling
(2) riverflow, where in the rising regime the past rainfall series $x_{n-1}$ is
included linearly. We assume here that the regime switching process is
independent of both $\{x_n\}$ and $\{y_n\}$ and follows a Markov chain. This model can
be easily embedded in our formulation (3.7) with $p = 1$ and $M_n$ taking on fixed
values except in the rising regime, when $M_n = \mu_1 + a\,x_{n-1}$. Extending
slightly the argument used in this section, if the rainfall series $\{x_n\}$ is
stationary and the regime switching process is ergodic, the riverflow series
$\{y_n\}$ is stationary if the AR(1) processes are either stationary or
nonstationary of the random walk type (cf. Example 2).

4 Existence of moments 

Existence of moments is often assumed in time series analysis, notably for the
second-order theory (cf. Brockwell and Davis, 1991). For a general stochastic
difference equation, Karlsen (1990) gives some general conditions for checking
the existence of finite moments. He also gives some examples where more explicit
results can be derived. In this section, by exploiting the Markovian structure
of the hidden state process, we derive directly some explicit conditions for the
existence of second-order moments of SVAR models and the related
autocorrelation property.

We make the following assumption.

(A) $\lim_{n\to+\infty} E[\|A_n \cdots A_1\| \mid I_0 = i] = 0$ for any $i = 1, \ldots, r$.

By the ergodicity of $\{I_n\}$, one can easily show that (A) is equivalent to the
condition that

(A') $\lim_{n\to\infty} E\|A_n \cdots A_1\| = 0$.

Consider the quantity defined by

$$\Phi_{n,i}(I_i) = E[\|A_n \cdots A_{i+1}\| \mid I_i]$$

for any $n$ and $i < n$. Since $\{A_n\}$ is an induced matrix-valued finite Markov
chain (FMC) defined in terms of $I_n$, it shares the usual Markov property; in
particular, $\Phi_{n,i}(\cdot)$ is independent of $i$ and depends only on $n - i$. If we
write

$$\Phi_l(I_0) = E[\|A_l \cdots A_1\| \mid I_0],$$

then $\Phi_{n,i}(\cdot) = \Phi_{n-i}(\cdot)$. We use $\Phi_l$ or $\Phi_{n,i}$ to denote their
unconditional analogues.

We have the following proposition on $\Phi_l(I_0)$.

Proposition 2 $\Phi_n(I_0) \to 0$ if and only if $\Phi_n(I_0)$ tends to 0 geometrically.

Proof: Suppose $\Phi_n(I_0) \to 0$; since the state space is finite, the convergence
is uniform over $I_0$. Then there exist an integer $l$ and a constant $\gamma < 1$ such
that $\Phi_l(i) \le \gamma$ for all $i$. Hence

$$\Phi_{2l}(I_0) \le E[\Phi_l(I_l)\, \|A_l \cdots A_1\| \mid I_0] \le \gamma\, \Phi_l(I_0) \le \gamma^2.$$

Iterating, there exists a constant $C$ such that $\Phi_n \le C \gamma^{[n/l]}$ for any
$n$. That is, $\Phi_n$ tends to 0 at a geometric rate. The converse is immediate. □

Theorem 2 The SVAR process has a stationary solution whose second-order moment
exists if (A) is satisfied.

Proof: Consider the expansion for SVAR in (2.1):

$$X_n = A_n \cdots A_1 X_0 + A_n \cdots A_2 E_1 + \cdots + A_n E_{n-1} + E_n.$$

Then,

$$\begin{aligned}
E\|X_n\| &\le E(\|A_n \cdots A_1\| \cdot \|X_0\|) + E(\|A_n \cdots A_2\| \cdot \|E_1\|) + \cdots + E(\|A_n\| \cdot \|E_{n-1}\|) + E\|E_n\| \\
&= E\{E[\|A_n \cdots A_1\| \mid I_0] \cdot \|X_0\|\} + E\{E[\|A_n \cdots A_2\| \mid I_1] \cdot \|\Sigma_{I_1} \varepsilon_{1 I_1}\|\} \\
&\qquad + \cdots + E\{E[\|A_n\| \mid I_{n-1}] \cdot \|\Sigma_{I_{n-1}} \varepsilon_{n-1, I_{n-1}}\|\} + E\|E_n\| \\
&\le \max_i \Phi_n(i)\, E\|X_0\| + \max_i \Phi_{n-1}(i) \cdot E\|E_1\| + \cdots + \max_i \Phi_1(i) \cdot E\|E_{n-1}\| + E\|E_n\|,
\end{aligned}$$

which is convergent if $E\|X_0\| < \infty$, by Proposition 2 and the ergodicity of
$\{I_n\}$. Here the assumptions on $\{E_n\}$ and the independence of $\{I_n\}$ and
$\{\varepsilon_{ni}\}$ are used. □

Note that, since $\log x$ is concave, Jensen's inequality implies that the strict
inequality

$$E\log\|A_n \cdots A_1\| < \log E\|A_n \cdots A_1\| \tag{4.1}$$

holds.

We denote $\limsup_{n\to\infty} (1/n) \log E\|A_n \cdots A_1\|$ by $\log\gamma$. Condition (A)
is equivalent to $\gamma < 1$. By (4.1), this further implies that

$$\lambda \le \log\gamma < 0. \tag{4.2}$$

This indicates that condition (A) or (A') is stronger than negativity of the
largest Lyapunov exponent $\lambda$, a potentially general condition for strict
stationarity. However, the latter does not even ensure existence of
second-order moments; see Bougerol and Picard for an example in the case of a
GARCH process.

Using the fact that

$$X_{n+m} = A_{n+m} \cdots A_{n+1} X_n + A_{n+m} \cdots A_{n+2} E_{n+1} + \cdots + A_{n+m} E_{n+m-1} + E_{n+m}$$

for any integers $m$ and $n$, we have

$$\begin{aligned}
|E X_n^T X_{n+m}| &= |E X_n^T A_{n+m} \cdots A_{n+1} X_n| \\
&\le E|\langle X_n, A_{n+m} \cdots A_{n+1} X_n \rangle| \\
&\le E(\|A_{n+m} \cdots A_{n+1}\| \cdot \|X_n\|^2).
\end{aligned}$$

That is,

$$|E X_n^T X_{n+m}| \le \gamma^m\, E\|X_n\|^2, \tag{4.3}$$

where we use the property that $\{X_n\}$ is causal and stationary, and
$\{A_{n+i}\}$ is stationary. Thus, the autocovariance matrix at lag $m$ of the
vector time series $\{X_n\}$ decays at a geometric rate, and is bounded by
$\gamma^m$.
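
Condition (A) involves the expected norm of a random matrix product, which is rarely available in closed form. A crude sufficient check, which is our own illustration and not part of the paper's argument, follows from submultiplicativity: $E[\|A_n \cdots A_1\| \mid I_0 = i] \le E[\prod_k \|B_{I_k}\| \mid I_0 = i] = ((PD)^n \mathbf{1})_i$, where $D = \mathrm{diag}(\|B_1\|, \ldots, \|B_r\|)$. In the scalar case the inequality is an equality. The sketch below iterates this recursion; the function name and test values are assumptions.

```python
def conditional_norm_bound(P, norms, n):
    """Return max_i ((P D)^n 1)_i with D = diag(norms), an upper bound on
    max_i E[ ||A_n ... A_1|| | I_0 = i ] for A_k = B[I_k]."""
    r = len(P)
    v = [1.0] * r                        # (P D)^0 applied to the all-ones vector
    for _ in range(n):
        v = [sum(P[i][j] * norms[j] * v[j] for j in range(r)) for i in range(r)]
    return max(v)
```

If the returned values decay to zero as `n` grows, condition (A) holds and with it the geometric bound on the autocovariances; if they do not, the check is inconclusive except in the scalar case.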

5 Switching ARMA models 

We note some extensions of the switching autoregressive models. First, a
switching moving average process of order $q$ (SMA($q$)) can be defined as

$$X_n = E_n + C_{1n} E_{n-1} + C_{2n} E_{n-2} + \cdots + C_{qn} E_{n-q}, \tag{5.1}$$

where $\{E_n\}$ is defined as before, and
$E_{n-j} = \sum_{i=1}^{r} \Sigma_i \varepsilon_{(n-j)i} \mathbf{1}_{\{I_n = i\}}$ for $j = 1, 2, \ldots, q$.
For each $j$ between 1 and $q$, the coefficient matrix $C_{jn}$ takes on a member of
a set of $r$ matrices depending on the value of $I_n$.

We also assume that $\{(\varepsilon_{n1}, \ldots, \varepsilon_{nr})^T\}$ is stationary as before. If
$\{I_n\}$ is stationary, it follows that $\{X_n\}$ is stationary since it is a
moving average function of stationary processes. On the other hand, if $\{I_n\}$
is ergodic, then for an arbitrary starting regime $\{I_n\}$ eventually converges
to stationarity and thus $\{X_n\}$ is asymptotically stationary.

Similar to ARMA models, one can define switching ARMA (SARMA) models in 
which the coefficient matrices in both AR part and MA part take on different values 
depending on the current regime. The stationarity condition for SVAR(l) models is 
also sufficient for SARMA(l,q) models. Since an AR(p) process can be represented 
as a vector AR(1) process, our theory applies to any switching ARMA(p,q) process. 

Other extensions are also possible. In particular, the transition probabilities
of switching may be allowed to depend on past values of the process, or on past
values of another process. This interesting class of nonlinear time series
models is closely related to some traditional state dependent nonlinear time
series models (cf. Tong 1990). Not surprisingly, there is increasing interest in
applying them in real modelling situations such as security time series and
high-frequency data. It is our hope that the present work may shed light on
these more complex models.